Association Rules Mining Using Heavy Itemsets

Authors

  • Girish Keshav Palshikar
  • Mandar S. Kale
  • Manoj M. Apte
Abstract

A well-known problem that limits the practical use of association rule mining algorithms is the extremely large number of rules they generate. So many rules make the algorithms inefficient and make it difficult for end users to comprehend the discovered rules. We present the concept of a heavy itemset. An itemset A is heavy (for given support and confidence values) if all possible association rules made up of items only in A are present. We prove a simple necessary and sufficient condition for an itemset to be heavy. We present a formula for the number of possible rules for a given heavy itemset, and show that a heavy itemset compactly represents an exponential number of association rules. We present an efficient greedy algorithm to generate a collection of disjoint heavy itemsets in a given transaction database. We then present a modified Apriori algorithm that uses heavy items and detects more heavy itemsets, not necessarily disjoint with the given ones.
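The definition of a heavy itemset can be illustrated with a brute-force check. The sketch below is not the paper's method: it does not use the necessary and sufficient condition mentioned in the abstract (which is not stated here), and it assumes that a "possible association rule" over A means any rule X -> Y where X and Y are non-empty, disjoint subsets of A, that the rule's support is the support of X ∪ Y, and that its confidence is support(X ∪ Y) / support(X). The function names and the sample data are hypothetical. Under this reading, an itemset of n items yields 3^n - 2^(n+1) + 1 candidate rules, which illustrates the exponential growth the abstract refers to; the paper's own formula may be stated differently.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def nonempty_subsets(items):
    """Yield every non-empty subset of items as a frozenset."""
    ordered = sorted(items)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def candidate_rules(itemset):
    """Yield all rules (X, Y) with X, Y non-empty, disjoint subsets of itemset.

    Assumption: this is what 'all possible rules made up of items only in A'
    means; the paper may define the rule space differently.
    """
    itemset = frozenset(itemset)
    for x in nonempty_subsets(itemset):
        for y in nonempty_subsets(itemset - x):
            yield x, y


def is_heavy(itemset, transactions, min_sup, min_conf):
    """Brute-force test: every candidate rule meets both thresholds."""
    for x, y in candidate_rules(itemset):
        sup_xy = support(x | y, transactions)
        sup_x = support(x, transactions)
        if sup_xy < min_sup or sup_x == 0 or sup_xy / sup_x < min_conf:
            return False
    return True


# Hypothetical toy database of five transactions.
transactions = [
    frozenset("abc"),
    frozenset("abc"),
    frozenset("abc"),
    frozenset("ab"),
    frozenset("d"),
]

rule_count = len(list(candidate_rules("abc")))
heavy_loose = is_heavy("abc", transactions, min_sup=0.5, min_conf=0.7)
heavy_strict = is_heavy("abc", transactions, min_sup=0.5, min_conf=0.8)
```

In this toy database the rule a -> bc has confidence of about 0.75, so {a, b, c} passes the check at a confidence threshold of 0.7 and fails it at 0.8. The check enumerates every rule, so it is only practical for small itemsets; it is meant to make the definition concrete, not to be efficient.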

Similar resources

Using Soft Set Theory for Mining Maximal Association Rules in Text Data

Using soft set theory for mining maximal association rules based on the concept of frequent maximal itemsets which appear maximally in many records has been developed in recent years. This method has been shown to be very effective for mining interesting association rules which are not obtained by using methods for regular association rule mining. There have been several algorithms developed to...

Negative and Positive Association Rules Mining from Text Using Frequent and Infrequent Itemsets

Association rule mining research typically focuses on positive association rules (PARs), generated from frequently occurring itemsets. However, in recent years, there has been a significant research focused on finding interesting infrequent itemsets leading to the discovery of negative association rules (NARs). The discovery of infrequent itemsets is far more difficult than their counterparts, ...

Preknowledge-based generalized association rules mining

The subject of this paper is the mining of generalized association rules using pruning techniques. Given a large transaction database and a hierarchical taxonomy tree of the items, we attempt to find the association rules between the items at different levels in the taxonomy tree under the assumption that original frequent itemsets and association rules have already been generated in advance. T...

An Algorithm for Mining High Utility Closed Itemsets and Generators

Traditional association rule mining based on the support-confidence framework provides the objective measure of the rules that are of interest to users. However, it does not reflect the utility of the rules. To extract non-redundant association rules in support-confidence framework frequent closed itemsets and their generators play an important role. To extract non-redundant association rules a...

An Algorithm of Constrained Spatial Association Rules Based on Binary

An algorithm of constrained association rules mining was presented in order to search for some items expected by people. Since some presented algorithms of association rules mining based on binary are complicated to generate frequent candidate itemsets, they may pay out heavy cost when these algorithms are used to extract constrained spatial association rules. And so this paper proposes an algo...

Journal:
  • Data Knowl. Eng.

Volume 61  Issue 

Pages  -

Publication date 2005